High speed non-concurrency controlled database

ABSTRACT

Embodiments of the present invention provide a method and system for high-speed database searching with concurrent updating, without the use of database locks or access controls, for large database systems. Specifically, a plurality of search queries may be received over a network, the database may be searched, and a plurality of search replies may be sent over the network. While searching the database, new information received over the network may be incorporated into the database by creating a new element based on the new information and writing a pointer to the new element to the database using a single uninterruptible operation.

CLAIM FOR PRIORITY/CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This non-provisional application claims the benefit of U.S. Provisional Patent Application Serial No. 60/330,842, filed Nov. 1, 2001, which is incorporated by reference in its entirety, and U.S. Provisional Patent Application Serial No. 60/365,169, filed Mar. 19, 2002, which is incorporated by reference in its entirety. This application is related to U.S. Non-Provisional patent application Ser. Nos. [Att'y Dkt 12307/100179], [Att'y Dkt 12307/100180], [Att'y Dkt 12307/100181] and [Att'y Dkt 12307/100182].

TECHNICAL FIELD

[0002] This disclosure relates to computer systems. More specifically, this disclosure relates to a method and system for providing high-speed database searching with concurrent updating, without the use of database locks or access controls, for large database systems.

BACKGROUND OF THE INVENTION

[0003] As the Internet continues its meteoric growth, scaling domain name service (DNS) resolution for root and generic top level domain (gTLD) servers at reasonable price points is becoming increasingly difficult. The A root server (i.e., a.root-servers.net) maintains and distributes the Internet namespace root zone file to the 12 secondary root servers geographically distributed around the world (e.g., b.root-servers.net, c.root-servers.net, etc.), while the corresponding gTLD servers (e.g., a.gtld-servers.net, b.gtld-servers.net, etc.) are similarly distributed and support the top level domains (e.g., *.com, *.net, *.org, etc.). The ever-increasing volume of data coupled with the unrelenting growth in query rates is forcing a complete rethinking of the hardware and software infrastructure needed for root and gTLD DNS service over the next several years. The typical single server installation of the standard “bind” software distribution is already insufficient for the demands of the A root and will soon be unable to meet even gTLD needs. With the convergence of the public switched telephone network (PSTN) and the Internet, there are opportunities for a general purpose, high performance search mechanism to provide features normally associated with Service Control Points (SCPs) on the PSTN's SS7 signaling network as new, advanced services are offered that span the PSTN and the Internet, including Advanced Intelligent Network (AIN), Voice Over Internet Protocol (VOIP) services, geolocation services, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

[0004] FIG. 1 is a system block diagram, according to an embodiment of the present invention.

[0005] FIG. 2 is a detailed block diagram that illustrates a message data structure, according to an embodiment of the present invention.

[0006] FIG. 3 is a detailed block diagram that illustrates a message latency data structure architecture, according to an embodiment of the present invention.

[0007] FIG. 4 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention.

[0008] FIG. 5 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention.

[0009] FIG. 6 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention.

[0010] FIG. 7 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention.

[0011] FIG. 8 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention.

[0012] FIG. 9 is a top level flow diagram that illustrates a method for searching and concurrently updating a database without the use of database locks or access controls, according to an embodiment of the present invention.

DETAILED DESCRIPTION

[0013] Embodiments of the present invention provide a method and system for high-speed database searching with concurrent updating, without the use of database locks or access controls, for large database systems. Specifically, a plurality of search queries may be received over a network, the database may be searched, and a plurality of search replies may be sent over the network. While searching the database, new information received over the network may be incorporated into the database by creating a new element based on the new information and, without locking the database, writing a pointer to the new element to the database using a single uninterruptible operation.

[0014] FIG. 1 is a block diagram that illustrates a system according to an embodiment of the present invention. Generally, system 100 may host a large, memory-resident database, receive search requests and provide search responses over a network. For example, system 100 may be a symmetric multiprocessing (SMP) computer, such as, for example, an IBM RS/6000® M80 or S80 manufactured by International Business Machines Corporation of Armonk, New York, a Sun Enterprise™ 10000 manufactured by Sun Microsystems, Inc. of Santa Clara, Calif., etc. System 100 may also be a multi-processor personal computer, such as, for example, a Compaq ProLiant™ ML530 (including two Intel Pentium® III 866 MHz processors) manufactured by Hewlett-Packard Company of Palo Alto, Calif. System 100 may also include a multiprocessing operating system, such as, for example, IBM AIX® 4, Sun Solaris™ 8 Operating Environment, Red Hat Linux® 6.2, etc. System 100 may receive periodic updates over network 124, which may be concurrently incorporated into the database. Embodiments of the present invention may achieve very high database search and update throughput by incorporating each update to the database without the use of database locks or access controls.

[0015] In an embodiment, system 100 may include at least one processor 102-1 coupled to bus 101. Processor 102-1 may include an internal memory cache (e.g., an L1 cache, not shown for clarity). A secondary memory cache 103-1 (e.g., an L2 cache, L2/L3 caches, etc.) may reside between processor 102-1 and bus 101. In a preferred embodiment, system 100 may include a plurality of processors 102-1 . . . 102-P coupled to bus 101. A plurality of secondary memory caches 103-1 . . . 103-P may also reside between the plurality of processors 102-1 . . . 102-P and bus 101 (e.g., a look-through architecture), or, alternatively, at least one secondary memory cache 103-1 may be coupled to bus 101 (e.g., a look-aside architecture). System 100 may include memory 104, such as, for example, random access memory (RAM), etc., coupled to bus 101, for storing information and instructions to be executed by the plurality of processors 102-1 . . . 102-P.

[0016] Memory 104 may store a large database, for example, for translating Internet domain names into Internet addresses, for translating names or phone numbers into network addresses, for providing and updating subscriber profile data, for providing and updating user presence data, etc. Advantageously, both the size of the database and the number of translations per second may be very large. For example, memory 104 may include at least 64 GB of RAM and may host a 500M (i.e., 500×10⁶) record domain name database, a 500M record subscriber database, a 450M record telephone number portability database, etc.

[0017] On an exemplary 64-bit system architecture, such as, for example, a system including at least one 64-bit big-endian processor 102-1 coupled to at least a 64-bit bus 101 and a 64-bit memory 104, an 8-byte pointer value may be written to a memory address on an 8-byte boundary (i.e., a memory address divisible by eight, or, e.g., 8N) using a single, uninterruptible operation. Generally, the presence of secondary memory cache 103-1 may simply delay the 8-byte pointer write to memory 104. For example, in one embodiment, secondary memory cache 103-1 may be a look-through cache operating in write-through mode, so that a single, 8-byte store instruction may move eight bytes of data from processor 102-1 to memory 104, without interruption, and in as few as two system clock cycles. In another embodiment, secondary memory cache 103-1 may be a look-through cache operating in write-back mode, so that the 8-byte pointer may first be written to secondary memory cache 103-1, which may then write the 8-byte pointer to memory 104 at a later time, such as, for example, when the cache line in which the 8-byte pointer is stored is written to memory 104 (e.g., when the particular cache line, or the entire secondary memory cache, is “flushed”).

[0018] Ultimately, from the perspective of processor 102-1, once the data are latched onto the output pins of processor 102-1, all eight bytes of data are written to memory 104 in one contiguous, uninterrupted transfer, which may be delayed by the effects of a secondary memory cache 103-1, if present. From the perspective of processors 102-2 . . . 102-P, once the data are latched onto the output pins of processor 102-1, all eight bytes of data are written to memory 104 in one contiguous, uninterrupted transfer, which is enforced by the cache coherency protocol across secondary memory caches 103-1 . . . 103-P, which may delay the write to memory 104 if present.

[0019] However, if an 8-byte pointer value is written to a misaligned location in memory 104, such as a memory address that crosses an 8-byte boundary, all eight bytes of data cannot be transferred from processor 102-1 using a single, 8-byte store instruction. Instead, processor 102-1 may issue two separate and distinct store instructions. For example, if the memory address begins four bytes before an 8-byte boundary (e.g., 8N−4), the first store instruction transfers the four most significant bytes to memory 104 (e.g., 8N−4), while the second store instruction transfers the four least significant bytes to memory 104 (e.g., 8N). Importantly, between these two separate store instructions, processor 102-1 may be interrupted, or processor 102-1 may lose control of bus 101 to another system component (e.g., processor 102-P, etc.). Consequently, the pointer value residing in memory 104 will be invalid until processor 102-1 can complete the second store instruction. If another component begins a single, uninterruptible memory read to this memory location, an invalid value will be returned as a presumably valid one.

[0020] Similarly, a new 4-byte pointer value may be written to a memory address divisible by four (e.g., 4N) using a single, uninterruptible operation. Note that in the example discussed above, a 4-byte pointer value may be written to the 8N−4 memory location using a single store instruction. Of course, if a 4-byte pointer value is written to a location that crosses a 4-byte boundary, e.g., 4N−2, all four bytes of data cannot be transferred from processor 102-1 using a single store instruction, and the pointer value residing in memory 104 may be invalid for some period of time.
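
By way of illustration only, the alignment condition described above may be sketched in C11. The names record, record_slot, is_aligned, publish and lookup are hypothetical and do not appear in the disclosure, and the use of the <stdatomic.h> interface is an assumption; it is one portable way to request a store that other processors observe either entirely or not at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type; only the pointer slot mechanics matter here. */
struct record { int value; };

/* A pointer-sized slot. The compiler places _Atomic objects on their
 * natural boundary (8N for 8-byte pointers, 4N for 4-byte pointers). */
typedef _Atomic(struct record *) record_slot;

/* True if addr lies on a boundary of `width` bytes (width a power of two). */
static bool is_aligned(const void *addr, size_t width)
{
    return ((uintptr_t)addr & (width - 1)) == 0;
}

/* Writer side: make rec visible with one indivisible pointer store.
 * Release ordering ensures the record contents are visible first. */
static void publish(record_slot *slot, struct record *rec)
{
    assert(is_aligned(slot, sizeof(void *)));
    atomic_store_explicit(slot, rec, memory_order_release);
}

/* Reader side: sees either the old pointer or the new one, never a mix. */
static struct record *lookup(record_slot *slot)
{
    return atomic_load_explicit(slot, memory_order_acquire);
}
```

In this sketch, an address such as 8N−4 fails the 8-byte alignment check but passes the 4-byte check, mirroring the two cases discussed above.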

[0021] System 100 may also include a read only memory (ROM) 106, or other static storage device, coupled to bus 101 for storing static information and instructions for processor 102-1. A storage device 108, such as a magnetic or optical disk, may be coupled to bus 101 for storing information and instructions. System 100 may also include display 110 (e.g., an LCD monitor) and input device 112 (e.g., keyboard, mouse, trackball, etc.), coupled to bus 101. System 100 may include a plurality of network interfaces 114-1 . . . 114-O, which may send and receive electrical, electromagnetic or optical signals that carry digital data streams representing various types of information. In an embodiment, network interface 114-1 may be coupled to bus 101 and local area network (LAN) 122, while network interface 114-O may be coupled to bus 101 and wide area network (WAN) 124. Plurality of network interfaces 114-1 . . . 114-O may support various network protocols, including, for example, Gigabit Ethernet (e.g., IEEE Standard 802.3-2002, published 2002), Fibre Channel (e.g., ANSI Standard X3.230-1994, published 1994), etc. Plurality of network computers 120-1 . . . 120-N may be coupled to LAN 122 and WAN 124. In one embodiment, LAN 122 and WAN 124 may be physically distinct networks, while in another embodiment, LAN 122 and WAN 124 may be connected via a network gateway or router (not shown for clarity). Alternatively, LAN 122 and WAN 124 may be the same network.

[0022] As noted above, system 100 may provide DNS resolution services. In a DNS resolution embodiment, DNS resolution services may generally be divided between network transport and data look-up functions. For example, system 100 may be a back-end look-up engine (LUE) optimized for data look-up on large data sets, while plurality of network computers 120-1 . . . 120-N may be a plurality of front-end protocol engines (PEs) optimized for network processing and transport. The LUE may be a powerful multiprocessor server that stores the entire DNS record set in memory 104 to facilitate high-speed, high-throughput searching and updating. In an alternative embodiment, DNS resolution services may be provided by a series of powerful multiprocessor servers, or LUEs, each storing a subset of the entire DNS record set in memory to facilitate high-speed, high-throughput searching and updating.

[0023] Conversely, the plurality of PEs may be generic, low profile, PC-based machines, running an efficient multitasking operating system (e.g., Red Hat Linux® 6.2), that minimize the network processing transport load on the LUE in order to maximize the available resources for DNS resolution. The PEs may handle the nuances of wire-line DNS protocol, respond to invalid DNS queries and multiplex valid DNS queries to the LUE over LAN 122. In an alternative embodiment including multiple LUEs storing DNS record subsets, the PEs may determine which LUE should receive each valid DNS query, and multiplex valid DNS queries to the appropriate LUEs. The number of PEs for a single LUE may be determined, for example, by the number of DNS queries to be processed per second and the performance characteristics of the particular system. Other metrics may also be used to determine the appropriate mapping ratios and behaviors.

[0024] Generally, other large-volume, query-based embodiments may be supported, including, for example, telephone number resolution, SS7 signaling processing, geolocation determination, telephone number-to-subscriber mapping, subscriber location and presence determination, etc.

[0025] In an embodiment, a central on-line transaction processing (OLTP) server 140-1 may be coupled to WAN 124 and receive additions, modifications and deletions (i.e., update traffic) to database 142-1 from various sources. OLTP server 140-1 may send updates to system 100, which includes a local copy of database 142-1, over WAN 124. OLTP server 140-1 may be optimized for processing update traffic in various formats and protocols, including, for example, Hypertext Transfer Protocol (HTTP), Registry Registrar Protocol (RRP), Extensible Provisioning Protocol (EPP), Service Management System/800 Mechanized Generic Interface (MGI), and other on-line provisioning protocols. A constellation of read-only LUEs may be deployed in a hub and spoke architecture to provide high-speed search capability conjoined with high-volume, incremental updates from OLTP server 140-1.

[0026] In an alternative embodiment, data may be distributed over multiple OLTP servers 140-1 . . . 140-S, each of which may be coupled to WAN 124. OLTP servers 140-1 . . . 140-S may receive additions, modifications, and deletions (i.e., update traffic) to their respective databases 142-1 . . . 142-S (not shown for clarity) from various sources. OLTP servers 140-1 . . . 140-S may send updates to system 100, which may include copies of databases 142-1 . . . 142-S, other dynamically-created data, etc., over WAN 124. For example, in a geolocation embodiment, OLTP servers 140-1 . . . 140-S may receive update traffic from groups of remote sensors. In another alternative embodiment, plurality of network computers 120-1 . . . 120-N may also receive additions, modifications, and deletions (i.e., update traffic) from various sources over WAN 124 or LAN 122. In this embodiment, plurality of network computers 120-1 . . . 120-N may send updates, as well as queries, to system 100.

[0027] In the DNS resolution embodiment, each PE (e.g., each of the plurality of network computers 120-1 . . . 120-N) may combine, or multiplex, several DNS query messages, received over a wide area network (e.g., WAN 124), into a single Request SuperPacket and send the Request SuperPacket to the LUE (e.g., system 100) over a local area network (e.g., LAN 122). The LUE may combine, or multiplex, several DNS query message replies into a single Response SuperPacket and send the Response SuperPacket to the appropriate PE over the local area network. Generally, the maximum size of a Request or Response SuperPacket may be limited by the maximum transmission unit (MTU) of the physical network layer (e.g., Gigabit Ethernet). For example, typical DNS query and reply message sizes of less than 100 bytes and 200 bytes, respectively, allow for over 30 queries to be multiplexed into a single Request SuperPacket, as well as over 15 replies to be multiplexed into a single Response SuperPacket. However, a smaller number of queries (e.g., 20 queries) may be included in a single Request SuperPacket in order to avoid MTU overflow on the response (e.g., 10 replies). For larger MTU sizes, the number of multiplexed queries and replies may be increased accordingly.
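
The relationship between MTU and the number of multiplexed messages may be illustrated with a short C sketch. The function names max_messages and queries_per_request are hypothetical, and the per-packet overhead value is left as a parameter because the disclosure does not state it. Capping the request at the number of replies that fit in one response is one possible policy for avoiding response overflow, offered here as an assumption rather than as the policy of any particular embodiment.

```c
#include <assert.h>
#include <stddef.h>

/* Number of fixed-size messages that fit in one SuperPacket, given the
 * MTU and the bytes consumed by headers (overhead). Both message size
 * and overhead are caller-supplied assumptions. */
static size_t max_messages(size_t mtu, size_t overhead, size_t msg_size)
{
    assert(msg_size > 0);
    if (mtu <= overhead)
        return 0;
    return (mtu - overhead) / msg_size;
}

/* Hypothetical batching policy: never send more queries in one Request
 * SuperPacket than replies that would fit in one Response SuperPacket. */
static size_t queries_per_request(size_t mtu, size_t overhead,
                                  size_t query_size, size_t reply_size)
{
    size_t q = max_messages(mtu, overhead, query_size);
    size_t r = max_messages(mtu, overhead, reply_size);
    return q < r ? q : r;
}
```

For example, with an assumed 9000-byte MTU, 6 bytes of overhead, 100-byte queries and 200-byte replies, 89 queries would fit in a request but only 44 replies would fit in a response, so this policy would batch 44 queries.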

[0028] Each multitasking PE may include an inbound thread and an outbound thread to manage DNS queries and replies, respectively. For example, the inbound thread may un-marshal the DNS query components from the incoming DNS query packets received over a wide area network and multiplex several milliseconds of queries into a single Request SuperPacket. The inbound thread may then send the Request SuperPacket to the LUE over a local area network. Conversely, the outbound thread may receive the Response SuperPacket from the LUE, de-multiplex the replies contained therein, and marshal the various fields into a valid DNS reply, which may then be transmitted over the wide area network. Generally, as noted above, other large-volume, query-based embodiments may be supported.

[0029] In an embodiment, the Request SuperPacket may also include state information associated with each DNS query, such as, for example, the source address, the protocol type, etc. The LUE may include the state information, and associated DNS replies, within the Response SuperPacket. Each PE may then construct and return valid DNS reply messages using the information transmitted from the LUE. Consequently, each PE may advantageously operate as a stateless machine, i.e., valid DNS replies may be formed from the information contained in the Response SuperPacket. Generally, the LUE may return the Response SuperPacket to the PE from which the incoming SuperPacket originated; however, other variations are possible.

[0030] In an alternative embodiment, each PE may maintain the state information associated with each DNS query and include a reference, or handle, to the state information within the Request SuperPacket. The LUE may include the state information references, and associated DNS replies, within the Response SuperPacket. Each PE may then construct and return valid DNS reply messages using the state information references transmitted from the LUE, as well as the state information maintained thereon. In this embodiment, the LUE may return the Response SuperPacket to the PE from which the incoming SuperPacket originated.

[0031] FIG. 2 is a detailed block diagram that illustrates a message data structure, according to an embodiment of the present invention. Generally, message 200 may include header 210, having a plurality of sequence numbers 211-1 . . . 211-S and a plurality of message counts 212-1 . . . 212-S, and data payload 215.

[0032] In the DNS resolution embodiment, message 200 may be used for Request SuperPackets and Response SuperPackets. For example, Request SuperPacket 220 may include header 230, having a plurality of sequence numbers 231-1 . . . 231-S and a plurality of message counts 232-1 . . . 232-S, and data payload 235 having multiple DNS queries 236-1 . . . 236-Q, accumulated by a PE over a predetermined period of time, such as, for example, several milliseconds. In one embodiment, each DNS query 236-1 . . . 236-Q may include state information, while in an alternative embodiment, each DNS query 236-1 . . . 236-Q may include a handle to state information.

[0033] Similarly, Response SuperPacket 240 may include header 250, having a plurality of sequence numbers 251-1 . . . 251-S and a plurality of message counts 252-1 . . . 252-S, and data payload 255 having multiple DNS replies 256-1 . . . 256-R approximately corresponding to the multiple DNS queries contained within Request SuperPacket 220. In one embodiment, each DNS reply 256-1 . . . 256-R may include state information associated with the corresponding DNS query, while in an alternative embodiment, each DNS reply 256-1 . . . 256-R may include a handle to state information associated with the corresponding DNS query. Occasionally, the total size of the corresponding DNS replies may exceed the size of data payload 255 of the Response SuperPacket 240. This overflow may be limited, for example, to a single reply, i.e., the reply associated with the last query contained within Request SuperPacket 220. Rather than sending an additional Response SuperPacket 240 containing only the single reply, the overflow reply may preferably be included in the next Response SuperPacket 240 corresponding to the next Request SuperPacket. Advantageously, header 250 may include appropriate information to determine the extent of the overflow condition. Under peak processing conditions, more than one reply may overflow into the next Response SuperPacket.

[0034] For example, in Response SuperPacket 240, header 250 may include at least two sequence numbers 251-1 and 251-2 and at least two message counts 252-1 and 252-2, grouped as two pairs of complementary fields. While there may be “S” number of sequence number and message count pairs, typically, S is a small number, such as, e.g., 2, 3, 4, etc. Thus, header 250 may include sequence number 251-1 paired with message count 252-1, sequence number 251-2 paired with message count 252-2, etc. Generally, message count 252-1 may reflect the number of replies contained within data payload 255 that are associated with sequence number 251-1. In an embodiment, sequence number 251-1 may be a two-byte field, while message count 252-1 may be a one-byte field.

[0035] In a more specific example, data payload 235 of Request SuperPacket 220 may include seven DNS queries (as depicted in FIG. 2). In one embodiment, sequence number 231-1 may be set to a unique value (e.g., 1024) and message count 232-1 may be set to seven, while sequence number 231-2 and message count 232-2 may be set to zero. In another embodiment, header 230 may contain only one sequence number and one message count, e.g., sequence number 231-1 and message count 232-1 set to 1024 and seven, respectively. Typically, Request SuperPacket 220 may contain all of the queries associated with a particular sequence number.

[0036] Data payload 255 of Response SuperPacket 240 may include seven corresponding DNS replies (as depicted in FIG. 2). In this example, header 250 may include information similar to Request SuperPacket 220, i.e., sequence number 251-1 set to the same unique value (i.e., 1024), message count 252-1 set to seven, and both sequence number 251-2 and message count 252-2 set to zero. However, in another example, data payload 255 of Response SuperPacket 240 may include only five corresponding DNS replies, and message count 252-1 may be set to five instead. The remaining two responses associated with sequence number 1024 may be included within the next Response SuperPacket 240.

[0037] The next Request SuperPacket 220 may include a different sequence number (e.g., 1025) and at least one DNS query, so that the next Response SuperPacket 240 may include the two previous replies associated with the 1024 sequence number, as well as at least one reply associated with the 1025 sequence number. In this example, header 250 of the next Response SuperPacket 240 may include sequence number 251-1 set to 1024, message count 252-1 set to two, sequence number 251-2 set to 1025 and message count 252-2 set to one. Thus, Response SuperPacket 240 may include a total of three replies associated with three queries contained within two different Request SuperPackets.
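
The header bookkeeping in the preceding example may be sketched in C as follows. The names sp_pair, sp_header, make_response_header and total_replies are hypothetical, S is assumed to be two, and the in-memory struct shown here is not intended to represent the byte layout on the wire, which the disclosure does not specify beyond the two-byte and one-byte field widths.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SP_PAIRS 2  /* "S" in the text; assumed to be 2 for this sketch */

/* One sequence number / message count pair (e.g., 251-1 with 252-1). */
struct sp_pair {
    uint16_t sequence;  /* two-byte sequence number */
    uint8_t  count;     /* one-byte message count   */
};

struct sp_header {
    struct sp_pair pair[SP_PAIRS];
};

/* Build a Response SuperPacket header that carries `carried` overflow
 * replies left over from prev_seq, followed by `fresh` replies for
 * cur_seq. Unused pairs remain zero, as in the example above. */
static struct sp_header make_response_header(uint16_t prev_seq, uint8_t carried,
                                             uint16_t cur_seq, uint8_t fresh)
{
    struct sp_header h;
    int i = 0;

    memset(&h, 0, sizeof h);
    if (carried > 0) {
        h.pair[i].sequence = prev_seq;
        h.pair[i].count = carried;
        i++;
    }
    if (fresh > 0) {
        h.pair[i].sequence = cur_seq;
        h.pair[i].count = fresh;
        i++;
    }
    assert(i <= SP_PAIRS);
    return h;
}

/* Total number of replies announced by the header. */
static unsigned total_replies(const struct sp_header *h)
{
    unsigned total = 0;
    for (int i = 0; i < SP_PAIRS; i++)
        total += h->pair[i].count;
    return total;
}
```

Under these assumptions, a call such as make_response_header(1024, 2, 1025, 1) yields the pairs (1024, 2) and (1025, 1), for a total of three replies, matching the example above.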

[0038] FIG. 3 is a detailed block diagram that illustrates a message latency data structure architecture, according to an embodiment of the present invention. Message latency data structure 300 may include information generally associated with the transmission and reception of message 200. In the DNS resolution embodiment, message latency data structure 300 may include latency information about Request SuperPackets and Response SuperPackets; this latency information may be organized in a table format indexed according to sequence number value (e.g., index 301). For example, message latency data structure 300 may include a number of rows N equal to the total number of unique sequence numbers, as illustrated, generally, by table elements 310, 320 and 330. In an embodiment, SuperPacket header sequence numbers may be two bytes in length and define a range of unique sequence numbers from zero to 2¹⁶−1 (i.e., 65,535). In this case, N may be equal to 65,536. Latency information may include Request Timestamp 302, Request Query Count 303, Response Timestamp 304, Response Reply Count 305, and Response Message Count 306. In an alternative embodiment, latency information may also include an Initial Response Timestamp (not shown).

[0039] In an example, table element 320 illustrates latency information for a Request SuperPacket 220 having a single sequence number 231-1 equal to 1024. Request Timestamp 302 may indicate when this particular Request SuperPacket was sent to the LUE. Request Query Count 303 may indicate how many queries were contained within this particular Request SuperPacket. Response Timestamp 304 may indicate when a Response SuperPacket having a sequence number equal to 1024 was received at the PE (e.g., network computer 120-N) and may be updated if more than one Response SuperPacket is received at the PE. Response Reply Count 305 may indicate the total number of replies contained within all of the received Response SuperPackets associated with this sequence number (i.e., 1024). Response Message Count 306 may indicate how many Response SuperPackets having this sequence number (i.e., 1024) arrived at the PE. Replies to the queries contained within this particular Request SuperPacket may be split over several Response SuperPackets, in which case, Response Timestamp 304, Response Reply Count 305, and Response Message Count 306 may be updated as each of the additional Response SuperPackets is received. In an alternative embodiment, the Initial Response Timestamp may indicate when the first Response SuperPacket containing replies for this sequence number (i.e., 1024) was received at the PE. In this embodiment, Response Timestamp 304 may be updated when additional (i.e., second and subsequent) Response SuperPackets are received.

[0040] Various important latency metrics may be determined from the latency information contained within message latency data structure 300. For example, simple cross-checking between Request Query Count 303 and Response Reply Count 305 for a given index 301 (i.e., sequence number) may indicate a number of missing replies. This difference may indicate the number of queries inexplicably dropped by the LUE. Comparing Request Timestamp 302 and Response Timestamp 304 may indicate how well the particular PE/LUE combination may be performing under the current message load. The difference between the current Request SuperPacket sequence number and the current Response SuperPacket sequence number may be associated with the response performance of the LUE; e.g., the larger the difference, the slower the performance. The Response Message Count 306 may indicate how many Response SuperPackets are being used for each Request SuperPacket, and may be important in DNS resolution traffic analysis. As the latency of the queries and replies travelling between the PEs and LUE increases, the PEs may reduce the number of DNS query packets processed by the system.
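
As a non-limiting sketch, the latency table and two of the metrics described above may be expressed in C. The names latency_row, latency_table, note_request, note_response, missing_replies and round_trip are hypothetical, and timestamps are assumed to be caller-supplied integer ticks, since the disclosure does not specify a clock source or field widths for the table columns.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ_COUNT 65536u  /* N: one row per two-byte sequence number */

/* One row of the table, mirroring fields 302 through 306. */
struct latency_row {
    uint64_t request_ts;         /* Request Timestamp 302      */
    uint32_t request_queries;    /* Request Query Count 303    */
    uint64_t response_ts;        /* Response Timestamp 304     */
    uint32_t response_replies;   /* Response Reply Count 305   */
    uint32_t response_messages;  /* Response Message Count 306 */
};

static struct latency_row latency_table[SEQ_COUNT];

/* Record that a Request SuperPacket with this sequence number was sent. */
static void note_request(uint16_t seq, uint64_t now, uint32_t queries)
{
    struct latency_row *row = &latency_table[seq];
    row->request_ts = now;
    row->request_queries = queries;
    row->response_ts = 0;
    row->response_replies = 0;
    row->response_messages = 0;
}

/* Record one Response SuperPacket carrying `replies` for this sequence
 * number; called again for each additional response that arrives. */
static void note_response(uint16_t seq, uint64_t now, uint32_t replies)
{
    struct latency_row *row = &latency_table[seq];
    assert(now >= row->request_ts);
    row->response_ts = now;
    row->response_replies += replies;
    row->response_messages += 1;
}

/* Queries that have not (yet) been answered for this sequence number. */
static uint32_t missing_replies(uint16_t seq)
{
    const struct latency_row *row = &latency_table[seq];
    if (row->response_replies >= row->request_queries)
        return 0;
    return row->request_queries - row->response_replies;
}

/* Ticks between sending the request and the latest response. */
static uint64_t round_trip(uint16_t seq)
{
    const struct latency_row *row = &latency_table[seq];
    return row->response_ts - row->request_ts;
}
```

Because the sequence number is used directly as the row index, each update touches exactly one row and requires no search of the table.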

[0041] Generally, the LUE may perform a multi-threaded look-up on the incoming, multiplexed Request SuperPackets, and may combine the replies into outgoing, multiplexed Response SuperPackets. For example, the LUE may spawn one search thread, or process, for each active PE and route all the incoming Request SuperPackets from that PE to that search thread. The LUE may spawn a manager thread, or process, to control the association of PEs to search threads, as well as an update thread, or process, to update the database located in memory 104. Each search thread may extract the search queries from the incoming Request SuperPacket, execute the various searches, construct an outgoing Response SuperPacket containing the search replies and send the SuperPacket to the appropriate PE. The update thread may receive updates to the database, from OLTP server 140-1, and incorporate the new data into the database. In an alternative embodiment, plurality of network computers 120-1 . . . 120-N may send updates to system 100. These updates may be included, for example, within the incoming Request SuperPacket message stream.

[0042] Accordingly, by virtue of the SuperPacket protocol, the LUE may spend less than 15% of its processor capacity on network processing, thereby dramatically increasing search query throughput. In an embodiment, an IBM® 8-way M80 may sustain search rates of 180 k to 220 k queries per second (qps), while an IBM® 24-way S80 may sustain 400 k to 500 k qps. Doubling the search rates, i.e., to 500 k and 1M qps, respectively, simply requires twice as much hardware, e.g., two LUEs with their attendant PEs. In another embodiment, a dual Pentium® III 866 MHz multi-processor personal computer operating Red Hat Linux® 6.2 may sustain update rates on the order of 100K/sec. Of course, increases in hardware performance also increase search and update rates associated with embodiments of the present invention, and as manufacturers replace these multiprocessor computers with faster-performing machines, for example, the sustained search and update rates may increase commensurately. Generally, system 100 is not limited to a client or server architecture, and embodiments of the present invention are not limited to any specific combination of hardware and/or software.

[0043] FIG. 4 is a block diagram that illustrates a general database architecture according to an embodiment of the present invention. In this embodiment, database 400 may include at least one table or group of database records 401, and at least one corresponding search index 402 with pointers (indices, direct byte-offsets, etc.) to individual records within the group of database records 401. For example, pointer 405 may reference database record 410.

[0044] In one embodiment, database 400 may include at least one hash table 403 as a search index with pointers (indices, direct byte-offsets, etc.) into the table or group of database records 401. A hash function may map a search key to an integer value which may then be used as an index into hash table 403. Because more than one search key may map to a single integer value, hash buckets may be created using a singly-linked list of hash chain pointers. For example, each entry within hash table 403 may contain a pointer to the first element of a hash bucket, and each element of the hash bucket may contain a hash chain pointer to the next element, or database record, in the linked-list. Advantageously, a hash chain pointer may be required only for those elements, or database records, that reference a subsequent element in the hash bucket.

[0045] Hash table 403 may include an array of 8-byte pointers to individual database records 401. For example, hash pointer 404 within hash table 403 may reference database record 420 as the first element within a hash bucket. Database record 420 may contain a hash chain pointer 424 which may reference the next element, or database record, in the hash bucket. Database record 420 may also include a data length 421 and associated fixed or variable-length data 422. In an embodiment, a null character 423, indicating the termination of data 422, may be included. Additionally, database record 420 may include a data pointer 425 which may reference another database record, either within the group of database records 401 or within a different table or group of database records (not shown), in which additional data may be located.

[0046] System 100 may use various, well-known algorithms to search this data structure architecture for a given search term or key. Generally, database 400 may be searched by multiple search processes, or threads, executing on at least one of the plurality of processors 102-1 . . . 102-P. However, modifications to database 400 may not be integrally performed by an update thread (or threads) unless the search thread(s) are prevented from accessing database 400 for the period of time necessary to add, modify, or delete information within database 400. For example, in order to modify database record 430 within database 400, the group of database records 401 may be locked by an update thread to prevent the search threads from accessing database 400 while the update thread is modifying the information within database record 430. There are many well-known mechanisms for locking database 400 to prevent search access, including the use of spin-locks, semaphores, mutexes, etc. Additionally, various off-the-shelf commercial databases provide specific commands to lock all or parts of database 400, e.g., the lock table command in the Oracle 8 Database, manufactured by Oracle Corporation of Redwood Shores, Calif., etc.

[0047] FIG. 5 is a block diagram that illustrates a general database architecture according to another embodiment of the present invention. In this embodiment, database 500 may include a highly-optimized, read-only, master snapshot file 510 and a growing, look-aside file 520. Master snapshot file 510 may include at least one table or group of database records 511, and at least one corresponding search index 512 with pointers (indices, direct byte-offsets, etc.) to individual records within the group of database records 511. Alternatively, master snapshot file 510 may include at least one hash table 513 as a search index with pointers (indices, direct byte-offsets, etc.) into the table or group of database records 511. Similarly, look-aside file 520 may include at least two tables or groups of database records, including database addition records 521 and database deletion records 531. Corresponding search indices 522 and 532 may be provided, with pointers (indices, direct byte-offsets, etc.) to individual records within the database addition records 521 and database deletion records 531. Alternatively, look-aside file 520 may include hash tables 523 and 533 as search indices, with pointers (indices, direct byte-offsets, etc.) into database addition records 521 and database deletion records 531, respectively.

[0048] System 100 may use various, well-known algorithms to search this data structure architecture for a given search term or key. In a typical example, look-aside file 520 may include all the recent changes to the data, and may be searched before read-only master snapshot file 510. If the search key is found in look-aside file 520, the response is returned without accessing snapshot file 510, but if the key is not found, then snapshot file 510 may be searched. However, when look-aside file 520 no longer fits in memory 104 with snapshot file 510, search query rates drop dramatically, by a factor of 10 to 50, or more, for example. Consequently, to avoid or minimize any drop in search query rates, snapshot file 510 may be periodically updated, or recreated, by incorporating all of the additions, deletions and modifications contained within look-aside file 520.
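The search order described above, look-aside file first and snapshot file second, can be illustrated with a short Python sketch. Plain dictionaries and a set stand in for the indexed record groups; the names lookup and rebuild_snapshot are hypothetical, and checking the deletion records before the addition records is an assumption, since the text does not fix that order.

```python
from typing import Dict, Optional, Set


def lookup(key: str,
           additions: Dict[str, str],
           deletions: Set[str],
           snapshot: Dict[str, str]) -> Optional[str]:
    """Search the look-aside data first and fall back to the snapshot on a miss."""
    if key in deletions:        # logically "forgotten" data (assumed to be checked first)
        return None
    if key in additions:        # added or modified data shadows the snapshot
        return additions[key]
    return snapshot.get(key)    # read-only master snapshot


def rebuild_snapshot(additions: Dict[str, str],
                     deletions: Set[str],
                     snapshot: Dict[str, str]) -> Dict[str, str]:
    """Recreate the snapshot by folding in all look-aside changes."""
    merged = dict(snapshot)
    merged.update(additions)
    for key in deletions:
        merged.pop(key, None)
    return merged
```

After rebuild_snapshot runs, the look-aside structures can start empty again, which keeps them small enough to stay in memory alongside the snapshot.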

[0049] Data within snapshot file 510 are not physically altered but logically added, modified or deleted. For example, data within snapshot file 510 may be deleted, or logically “forgotten,” by creating a corresponding delete record within database deletion records 531 and writing a pointer to the delete record to the appropriate location in hash table 533. Data within snapshot file 510 may be logically modified by copying a data record from snapshot file 510 to a new data record within database addition records 521, modifying the data within the new entry, and then writing a pointer to the new entry to the appropriate hash table (e.g., hash table 523) or chain pointer within database addition records 521. Similarly, data may be logically added to snapshot file 510 by creating a new data record within database addition records 521 and then writing a pointer to the new entry to the appropriate hash table (e.g., hash table 523) or chain pointer within database addition records 521.

[0050] In the DNS resolution embodiment, for example, snapshot file 510 may include domain name data and name server data, organized as separate data tables, or blocks, with separate search indices (e.g., 511-1, 511-2, 512-1, 512-2, 513-1, 513-2, etc., not shown for clarity). Similarly, look-aside file 520 may include additions and modifications to both the domain name data and the name server data, as well as deletions to both the domain name data and the name server data (e.g., 521-1, 521-2, 522-1, 522-2, 523-1, 523-2, 531-1, 531-2, 532-1, 532-2, 533-1, 533-2, etc., not shown for clarity).

[0051] FIG. 6 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention. Generally, database 600 may be organized into a single, searchable representation of the data. Data set updates may be continuously incorporated into database 600, and deletes or modifications may be physically performed on the relevant database records to free space within memory 104, for example, for subsequent additions or modifications. The single, searchable representation scales extremely well to large data set sizes and high search and update rates, and obviates the need to periodically recreate, propagate and reload snapshot files among multiple search engine computers.

[0052] In a DNS resolution embodiment, for example, database 600 may include domain name data 610 and name server data 630. Domain name data 610 and name server data 630 may include search indices with pointers (indices, direct byte-offsets, etc.) into blocks of variable length records. As discussed above, a hash function may map a search key to an integer value which may then be used as an index into a hash table. Similarly, hash buckets may be created for each hash table index using a singly-linked list of hash chain pointers. Domain name data 610 may include, for example, a hash table 612 as a search index and a block of variable-length domain name records 611. Hash table 612 may include an array of 8-byte pointers to individual domain name records 611, such as, for example, pointer 613 referencing domain name record 620. Variable-length domain name record 620 may include, for example, a next record offset 621, a name length 622, a normalized name 623, a chain pointer 624 (i.e., pointing to the next record in the hash chain), a number of name servers 625, and a name server pointer 626. The size of both chain pointer 624 and name server pointer 626 may be optimized to reflect the required block size for each particular type of data, e.g., eight bytes for chain pointer 624 and four bytes for name server pointer 626.

[0053] Name server data 630 may include, for example, a hash table 632 as a search index and a block of variable-length name server records 631. Hash table 632 may include an array of 4-byte pointers to individual name server records 631, such as, for example, pointer 633 referencing name server record 640. Variable-length name server record 640 may include, for example, a next record offset 641, a name length 642, a normalized name 643, a chain pointer 644 (i.e., pointing to the next record in the hash chain), a number of name server network addresses 645, a name server network address length 646, and a name server network address 647, which may be, for example, an Internet Protocol (IP) network address. Generally, name server network addresses may be stored in ASCII (American Standard Code for Information Interchange, e.g., ISO-14962-1997, ANSI-X3.4-1997, etc.) or binary format; in this example, name server network address length 646 indicates that name server network address 647 is stored in binary format (i.e., four bytes). The size of chain pointer 644 may also be optimized to reflect the required name server data block size, e.g., four bytes.

[0054] Generally, both search indices, such as hash tables, and variable-length data records may be structured so that 8-byte pointers are located on 8-byte boundaries in memory. For example, hash table 612 may contain a contiguous array of 8-byte pointers to domain name records 611, and may be stored at a memory address divisible by eight (i.e., an 8-byte boundary, or 8N). Similarly, both search indices, such as hash tables, and variable-length data records may be structured so that 4-byte pointers are located on 4-byte boundaries in memory. For example, hash table 632 may contain a contiguous array of 4-byte pointers to name server records 631, and may be stored at a memory address divisible by four (i.e., a 4-byte boundary, or 4N). Consequently, modifications to database 600 may conclude by updating a pointer to an aligned address in memory using a single uninterruptible operation, including, for example, writing a new pointer to the search index, such as a hash table, or writing a new hash chain pointer to a variable-length data record.
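The ordering implied above, in which a record is fully built first and made reachable last by one aligned pointer write, can be sketched as follows. This is an illustration under stated assumptions: a bytearray stands in for memory 104, the names memory, HASH_SLOT, write_pointer, add_record and walk_bucket are hypothetical, and Python itself gives no hardware-level guarantee that the final store is uninterruptible. On the target hardware that guarantee would come from an aligned 8-byte store.

```python
import struct

memory = bytearray(64)  # stand-in for a region of memory 104
HASH_SLOT = 0           # one 8-byte hash table entry, located on an 8-byte boundary
NULL = 0                # offset 0 doubles as the null pointer in this sketch


def read_pointer(offset: int) -> int:
    """Read an 8-byte pointer stored at the given offset."""
    return struct.unpack_from("<Q", memory, offset)[0]


def write_pointer(offset: int, value: int) -> None:
    """Write an 8-byte pointer, refusing any location that is not 8-byte aligned."""
    if offset % 8 != 0:
        raise ValueError("8-byte pointers must be located on 8-byte boundaries")
    struct.pack_into("<Q", memory, offset, value)


def add_record(record_offset: int, payload: bytes) -> None:
    # Step 1: build the complete record where no search can reach it yet.
    # Its chain pointer is set to the current head of the bucket.
    write_pointer(record_offset, read_pointer(HASH_SLOT))
    start = record_offset + 8
    memory[start:start + len(payload)] = payload
    # Step 2: publish the record with a single aligned pointer write.
    write_pointer(HASH_SLOT, record_offset)


def walk_bucket() -> list:
    """Return the record offsets a search would visit, in chain order."""
    offsets = []
    offset = read_pointer(HASH_SLOT)
    while offset != NULL:
        offsets.append(offset)
        offset = read_pointer(offset)
    return offsets
```

A search that reads the hash table entry sees either the old bucket head or the new, fully formed record, never a partially written one, which is the property that removes the need for a lock in this scheme.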

[0055] FIG. 7 is a detailed block diagram that illustrates a non-concurrency controlled data structure architecture, according to an embodiment of the present invention. Generally, database 700 may also be organized into a single, searchable representation of the data. Data set updates may be continuously incorporated into database 700, and deletes or modifications may be physically performed on the relevant database records to free space within memory 104, for example, for subsequent additions or modifications. The single, searchable representation scales extremely well to large data set sizes and high search and update rates, and obviates the need to periodically recreate, propagate and reload snapshot files among multiple search engine computers.

[0056] Many different physical data structure organizations are possible. An exemplary organization may use an alternative search index to hash tables for ordered, sequential access to the data records, such as the ternary search tree (trie), or TST, which combines the features of binary search trees and digital search tries. In text-based applications, such as, for example, whois, domain name resolution using DNS Secure Extensions (Internet Engineering Task Force Request for Comments: 2535), etc., TSTs advantageously minimize the number of comparison operations required to be performed, particularly in the case of a search miss, and may yield search performance metrics exceeding search engine implementations with hashing. Additionally, TSTs may also provide advanced text search features, such as, e.g., wildcard searches, which may be useful in text search applications, such as, for example, whois, domain name resolution, Internet content search, etc.

[0057] In an embodiment, a TST may contain a sequence of nodes linked together in a hierarchical relationship. A root node may be located at the top of the tree, related child nodes and links may form branches, and leaf nodes may terminate the end of each branch. Each leaf node may be associated with a particular search key, and each node on the path to the leaf node may contain a single, sequential element of the key. Each node in the tree contains a comparison character, or split value, and three pointers to other successive, or “child,” nodes in the tree. These pointers reference child nodes whose split values are less than, equal to, or greater than the node's split value. Searching the TST for a particular key, therefore, involves traversing the tree from the root node to a final leaf node, sequentially comparing each element, or character position, of the key with the split values of the nodes along the path. Additionally, a leaf node may also contain a pointer to a key record, which may, in turn, contain at least one pointer to a terminal data record containing the record data associated with the key (e.g., an IP address). Alternatively, the key record may contain the record data in its entirety. Record data may be stored in binary format, ASCII text format, etc.
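A textbook ternary search tree makes the traversal described above concrete. The sketch below is a generic TST rather than the specific layout of database 700: the names Node, tst_insert and tst_search are hypothetical, keys are assumed to be non-empty strings, and a key record may hang off any node where a key ends, not only off a leaf.

```python
from typing import Optional


class Node:
    """One TST node: a split character and less-than / equal / greater-than children."""

    __slots__ = ("split", "lo", "eq", "hi", "key_record")

    def __init__(self, split: str) -> None:
        self.split = split
        self.lo: Optional["Node"] = None
        self.eq: Optional["Node"] = None
        self.hi: Optional["Node"] = None
        self.key_record = None  # stays None when no key ends at this node


def tst_insert(node: Optional[Node], key: str, record, i: int = 0) -> Node:
    """Insert key (non-empty) and return the possibly new subtree root."""
    ch = key[i]
    if node is None:
        node = Node(ch)
    if ch < node.split:
        node.lo = tst_insert(node.lo, key, record, i)
    elif ch > node.split:
        node.hi = tst_insert(node.hi, key, record, i)
    elif i + 1 < len(key):
        node.eq = tst_insert(node.eq, key, record, i + 1)
    else:
        node.key_record = record
    return node


def tst_search(node: Optional[Node], key: str):
    """Compare one key character per node; a missing child means a search miss."""
    i = 0
    while node is not None:
        ch = key[i]
        if ch < node.split:
            node = node.lo
        elif ch > node.split:
            node = node.hi
        elif i + 1 < len(key):
            node = node.eq
            i += 1
        else:
            return node.key_record
    return None
```

A miss is detected as soon as the traversal reaches a missing child, so an absent key such as "lax" stops after a handful of character comparisons without any full-key comparison.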

[0058] In an embodiment, database 700 may be organized as a TST, including a plurality of fixed-length search nodes 701, a plurality of variable-length key data records 702 and a plurality of variable-length terminal data records 703. Search nodes 701 may include various types of information as described above, including, for example, a comparison character (or value) and position, branch node pointers and a key pointer. The size of the node pointers may generally be determined by the number of nodes, while the size of the key pointers may generally be determined by the size of the variable-length key data set. Key data records 702 may contain key information and terminal data information, including, for example, pointers to terminal data records or embedded record data, while terminal data records 703 may contain record data.

[0059] In an embodiment, each fixed-length search node may be 24 bytes in length. Search node 710, for example, may contain an eight-bit comparison character (or byte value) 711, a 12-bit character (or byte) position 712, and a 12-bit node type/status (not shown for clarity); these data may be encoded within the first four bytes of the node. The comparison character 711 may be encoded within the first byte of the node as depicted in FIG. 7, or, alternatively, character position 712 may be encoded within the first 12 bits of the node in order to optimize access to character position 712 using a simple shift operation. The next 12 bytes of each search node may contain three 32-bit pointers, i.e., pointer 713, pointer 714 and pointer 715, representing “less than,” “equal to,” and “greater than” branch node pointers, respectively. These pointers may contain a counter, or node index, rather than a byte-offset or memory address. For fixed-length search nodes, the byte-offset may be calculated from the counter, or index value, and the fixed-length, e.g., counter*length. The final eight bytes may contain a 40-bit key pointer 716, which may be a null value indicating that a corresponding key data record does not exist (shown) or a pointer to an existing corresponding key data record (not shown), as well as other data, including, for example, a 12-bit key length and a 12-bit pointer type/status field. Key pointer 716 may contain a byte offset to the appropriate key data record, while the key length may be used to optimize search and insertion when eliminating one-way branching within the TST. The pointer type/status field may contain information used in validity checking and allocation data used in memory management.
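The field widths given above can be checked by packing one node with Python's struct module. The sketch assumes that the key pointer, key length and pointer type/status fields together fill one 64-bit word (40 + 12 + 12 bits), which is what brings the node to 24 bytes. The bit order within each word, the big-endian byte order and the names pack_search_node, unpack_search_node and node_byte_offset are illustrative assumptions.

```python
import struct

NODE_LENGTH = 24  # bytes per fixed-length search node


def pack_search_node(char, position, node_status, lt, eq, gt,
                     key_ptr, key_len, ptr_status) -> bytes:
    """Pack one search node: 4-byte header, three 32-bit node indices, 8-byte key word."""
    assert char < 1 << 8 and position < 1 << 12 and node_status < 1 << 12
    assert key_ptr < 1 << 40 and key_len < 1 << 12 and ptr_status < 1 << 12
    header = (char << 24) | (position << 12) | node_status      # 8 + 12 + 12 = 32 bits
    key_word = (key_ptr << 24) | (key_len << 12) | ptr_status   # 40 + 12 + 12 = 64 bits
    return struct.pack(">IIIIQ", header, lt, eq, gt, key_word)


def unpack_search_node(raw: bytes) -> tuple:
    """Inverse of pack_search_node; returns the nine fields in the same order."""
    header, lt, eq, gt, key_word = struct.unpack(">IIIIQ", raw)
    return (header >> 24, (header >> 12) & 0xFFF, header & 0xFFF,
            lt, eq, gt,
            key_word >> 24, (key_word >> 12) & 0xFFF, key_word & 0xFFF)


def node_byte_offset(node_index: int) -> int:
    """Branch pointers hold node indices; the byte offset is index * length."""
    return node_index * NODE_LENGTH
```

Storing node indices rather than byte offsets in the branch pointers lets a 32-bit field address 2³² nodes, i.e., 24 times more bytes than a 32-bit byte offset could reach.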

[0060] In an embodiment, key data record 750 may include, for example, a variable-length key 753 and at least one terminal data pointer. As depicted in FIG. 7, key data record 750 includes two terminal data pointers: terminal data pointer 757 and terminal data pointer 758. Key data record 750 may be prefixed with a 12-bit key length 751 and a 12-bit terminal pointer count/status 752, and may include padding (not shown for clarity) to align the terminal data pointer 757 and terminal data pointer 758 on an eight-byte boundary in memory 104. Terminal data pointer 757 and terminal data pointer 758 may each contain various data, such as, for example, terminal data type, length, status or data useful in binary record searches. Terminal data pointer 757 and terminal data pointer 758 may be sorted by terminal data type for quicker retrieval of specific resource records (e.g., terminal data record 760 and terminal data record 770). In another embodiment, key data record 740 may include embedded terminal data 746 rather than, or in addition to, terminal data record pointers. For example, key data record 740 may include a key length 741, a terminal pointer count 742, a variable-length key 743, the number of embedded record elements 744, followed by a record element length 745 (in bytes, for example) and embedded record data 746 (e.g., a string, a byte sequence, etc.) for each of the number of embedded record elements 744.

[0061] In an embodiment, terminal data record 760, for example, may include a 12-bit length 761, a 4-bit status, and a variable-length string 762 (e.g., an IP address). Alternatively, variable length string 762 may be a byte sequence. Terminal data record 760 may include padding to align each terminal data record to an 8-byte boundary in memory 104. Alternatively, terminal data record 760 may include padding to a 4-byte boundary, or terminal data record 760 may not include any padding. Memory management algorithms may determine, generally, whether terminal data records 760 are padded to 8-byte, 4-byte, or 0-byte boundaries. Similarly, terminal data record 770 may include a 12-bit length 771, a 4-bit status, and a variable-length string 772 (e.g., an IP address).

[0062] Generally, both search indices, such as TSTs, and data records may be structured so that 8-byte pointers are located on 8-byte boundaries in memory. For example, key pointer 726 may contain an 8-byte (or less) pointer to key data record 740, and may be stored at a memory address divisible by eight (i.e., an 8-byte boundary, or 8N). Similarly, both search indices, such as TSTs, and data records may be structured so that 4-byte pointers are located on 4-byte boundaries in memory. For example, node branch pointer 724 may contain a 4-byte (or less) pointer to node 730, and may be stored at a memory address divisible by four (i.e., a 4-byte boundary, or 4N). Consequently, modifications to database 700 may conclude by updating a pointer to an aligned address in memory using a single uninterruptible operation, including, for example, writing a new pointer to the search index, such as a TST node, or writing a new pointer to a data record.

[0063] FIG. 8 is a detailed block diagram that illustrates another data structure architecture, according to an embodiment of the present invention. As above, database 800 may also be organized into a single, searchable representation of the data. Data set updates may be continuously incorporated into database 800, and deletes or modifications may be physically performed on the relevant database records to free space within memory 104, for example, for subsequent additions or modifications. The single, searchable representation scales extremely well to large data set sizes and high search and update rates, and obviates the need to periodically recreate, propagate and reload snapshot files among multiple search engine computers.

[0064] Other search index structures are possible for accessing record data. In an embodiment, database 800 may use an alternative ordered search index, organized as an ordered access key tree (i.e., “OAK tree”). Database 800 may include, for example, a plurality of variable-length search nodes 801, a plurality of variable-length key records 802 and a plurality of variable-length terminal data records 803. Search nodes 801 may include various types of information as described above, such as, for example, search keys, pointers to other search nodes, pointers to key records, etc. In an embodiment, plurality of search nodes 801 may include vertical and horizontal nodes containing fragments of search keys (e.g., strings), as well as pointers to other search nodes or key records. Vertical nodes may include, for example, at least one search key, or character, pointers to horizontal nodes within the plurality of search nodes 801, pointers to key records within the plurality of key records 802, etc. Horizontal nodes may include, for example, at least two search keys, or characters, pointers to vertical nodes within the plurality of search nodes 801, pointers to horizontal nodes within the plurality of search nodes 801, pointers to key records within the plurality of key records 802, etc. Generally, vertical nodes may include a sequence of keys (e.g., characters) representing a search key fragment (e.g., string), while horizontal nodes may include various keys (e.g., characters) that may exist at a particular position within the search key fragment (e.g., string).

[0065] In an embodiment, plurality of search nodes 801 may include vertical node 810, vertical node 820 and horizontal node 830. Vertical node 810 may include, for example, a 2-bit node type 811 (e.g., “10”), a 38-bit address 812, an 8-bit length 813 (e.g., “8”), an 8-bit first character 814 (e.g., “l”) and an 8-bit second character 815 (e.g., “null”). In this example, address 812 may point to the next node in the search tree, i.e., vertical node 820. In an embodiment, 38-bit address 812 may include a 1-bit terminal/nodal indicator and a 37-bit offset address to reference one of the 8-byte words within a 1 Tbyte (~10¹² byte) address space of memory 104. Accordingly, vertical node 810 may be eight bytes (64 bits) in length, and, advantageously, may be located on an 8-byte word boundary within memory 104. Generally, each vertical node within plurality of search nodes 801 may be located on an 8-byte word boundary within memory 104.
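The 64-bit budget of a vertical node (2 + 38 + 8 + 8 + 8 bits) and the reach of the 37-bit word offset can be verified with a short Python sketch. The order of the fields within the word, the big-endian byte order and the names pack_vertical_node and unpack_vertical_node are assumptions made for illustration; only the field widths come from the description above.

```python
VERTICAL = 0b10  # node type value used for vertical nodes in the example above


def pack_vertical_node(terminal: bool, word_offset: int, length: int,
                       first: int, second: int = 0) -> bytes:
    """Pack type (2) + address (1 + 37) + length (8) + two characters (8 + 8) into 64 bits."""
    assert word_offset < 1 << 37 and length < 1 << 8
    assert first < 1 << 8 and second < 1 << 8
    address = (int(terminal) << 37) | word_offset  # 38-bit address field
    word = (VERTICAL << 62) | (address << 24) | (length << 16) | (first << 8) | second
    return word.to_bytes(8, "big")


def unpack_vertical_node(raw: bytes) -> dict:
    """Inverse of pack_vertical_node."""
    word = int.from_bytes(raw, "big")
    address = (word >> 24) & ((1 << 38) - 1)
    return {
        "type": word >> 62,
        "terminal": bool(address >> 37),
        "word_offset": address & ((1 << 37) - 1),
        "length": (word >> 16) & 0xFF,
        "first": (word >> 8) & 0xFF,
        "second": word & 0xFF,
    }


def byte_address(word_offset: int) -> int:
    """The offset counts 8-byte words, so 2**37 words span 2**40 bytes (1 Tbyte)."""
    return word_offset * 8
```

Because the offset addresses words rather than bytes, three address bits are saved, which is what allows the address, the type and three byte-wide fields to share a single 8-byte word.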

[0066] A vertical node may include a multi-character, search key fragment (e.g., string). Generally, search keys without associated key data records may be collapsed into a single vertical node to effectively reduce the number of vertical nodes required within plurality of search nodes 801. In an embodiment, vertical node 810 may include eight bits for each additional character, above two characters, within the search key fragment, such as, for example, 8-bit characters 816-1, 816-2 . . . 816-N (shown in phantom outline). Advantageously, vertical node 810 may be padded to a 64-bit boundary within memory 104 in accordance with the number of additional characters located within the string fragment. For example, if nine characters are to be included within vertical node 810, then characters one and two may be assigned to first character 814 and second character 815, respectively, and 56 bits of additional character information, corresponding to characters three through nine, may be appended to vertical node 810. An additional eight bits of padding may be included to align the additional character information on an 8-byte word boundary.
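The padding arithmetic in the nine-character example generalizes to any fragment length. The helper below is a small worked calculation; the name vertical_node_size is hypothetical, and it assumes the base node is one 8-byte word holding the first two characters.

```python
def vertical_node_size(num_chars: int) -> tuple:
    """Return (total bytes, padding bits) for a vertical node holding num_chars characters."""
    extra_chars = max(0, num_chars - 2)      # characters beyond the two in the base word
    extra_bytes = -(-extra_chars // 8) * 8   # round up to a whole number of 8-byte words
    padding_bits = (extra_bytes - extra_chars) * 8
    return 8 + extra_bytes, padding_bits
```

For nine characters, seven extra characters occupy 56 bits and eight bits of padding complete the second word, so vertical_node_size(9) yields (16, 8).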

[0067] Similarly, vertical node 820 may include, for example, a 2-bit node type 821 (e.g., “10”), a 38-bit address 822, an 8-bit length 823 (e.g., “8”), an 8-bit first character 824 (e.g., “a”) and an 8-bit second character 825 (e.g., “null”). In this example, address 822 may point to the next node in the search tree, i.e., horizontal node 830. Accordingly, vertical node 820 may be eight bytes in length, and, advantageously, may be located on an 8-byte word boundary within memory 104. Of course, additional information may also be included within vertical node 820 if required, as described above with reference to vertical node 810.

[0068] Horizontal node 830 may include, for example, a 2-bit node type 831 (e.g., “01”), a 38-bit first address 832, an 8-bit address count 833 (e.g., 2), an 8-bit first character 834 (e.g., “-”), an 8-bit last character 835 (e.g., “w”), a variable-length bitmap 836 and a 38-bit second address 837. In this example, first character 834 may include a single character, “-”, representing the search key fragment “la” defined by vertical nodes 810 and 820, while last character 835 may include a single character, “w,” representing the search key fragment “law” defined by vertical nodes 810 and 820, and the last character 835 of horizontal node 830. First address 832 may point to key data record 840, associated with the search key fragment “la,” while second address 837 may point to key data record 850 associated with the search key fragment “law.”

[0069] Bitmap 836 may advantageously indicate which keys (e.g., characters) are referenced by horizontal node 830. A “1” within a bit position in bitmap 836 indicates that the key, or character, is referenced by horizontal node 830, while a “0” within a bit position in bitmap 836 may indicate that the key, or character, is not referenced by horizontal node 830. Generally, the length of bitmap 836 may depend upon the number of sequential keys, or characters, between first character 834 and last character 835, inclusive of these boundary characters. For example, if first character 834 is “a” and last character 835 is “z,” then bitmap 836 may be 26 bits in length, where each bit corresponds to one of the characters between, and including, “a” through “z.” In this example, additional 38-bit addresses would be appended to the end of horizontal node 830, corresponding to each of the characters represented within bitmap 836. Each of these 38-bit addresses, as well as bitmap 836, may be padded to align each quantity on an 8-byte word boundary within memory 104. In an embodiment, the eight-bit ASCII character set may be used as the search key space so that bitmap 836 may be as long as 256 bits (i.e., 2⁸ bits or 32 bytes). In the example depicted in FIG. 8, due to the special reference character “·” and address count 833 of “2,” bitmap 836 may be two bits in length and may include a “1” in each bit position corresponding to last character 835.
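The bitmap mechanism described above can be sketched with Python integers used as bit strings. Two points are assumptions rather than statements from the text: bit 0 is taken to correspond to the first character, and the appended addresses are taken to be stored one per set bit, so that the slot for a character is the count of set bits below its position. The names build_bitmap and address_slot are hypothetical.

```python
from typing import Optional, Tuple


def build_bitmap(first: str, last: str, present: str) -> Tuple[int, int]:
    """Return (bitmap, length in bits) for the inclusive character range first..last."""
    bits = 0
    for ch in present:
        assert first <= ch <= last, "character outside the node's range"
        bits |= 1 << (ord(ch) - ord(first))
    return bits, ord(last) - ord(first) + 1


def address_slot(bits: int, first: str, ch: str) -> Optional[int]:
    """Index of the appended address for ch, or None when ch is not referenced."""
    position = ord(ch) - ord(first)
    if position < 0 or not (bits >> position) & 1:
        return None                      # a "0" bit: immediate search miss
    below = bits & ((1 << position) - 1)
    return bin(below).count("1")         # number of referenced characters before ch
```

With a range of “a” through “z” the bitmap is 26 bits long, and a full eight-bit key space gives 256 bits, i.e., 32 bytes. A cleared bit reports a miss without following any pointer.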

[0070] In an embodiment, and as discussed with reference to key data record 750 (FIG. 7), key data record 850 may include, for example, a variable-length key 853 and at least one terminal data pointer. As depicted in FIG. 8, key data record 850 includes two terminal data pointers, terminal data pointer 857 and terminal data pointer 858. Key data record 850 may be prefixed with a 12-bit key length 851 and a 12-bit terminal pointer count/status 852, and may include padding (not shown for clarity) to align the terminal data pointer 857 and terminal data pointer 858 on an 8-byte boundary in memory 104. Terminal data pointer 857 and terminal data pointer 858 may each contain a 10-bit terminal data type and other data, such as, for example, length, status or data useful in binary record searches. Terminal data pointer 857 and terminal data pointer 858 may be sorted by terminal data type for quicker retrieval of specific resource records (e.g., terminal data record 860 and terminal data record 870).

[0071] In another embodiment, and as discussed with reference to key data record 740 (FIG. 7), key data record 840 may include embedded terminal data 846 rather than a terminal data record pointer. For example, key data record 840 may include a key length 841, a terminal pointer count 842, a variable-length key 843, the number of embedded record elements 844, followed by a record element length 845 (in bytes, for example) and embedded record data 846 (e.g., a string, a byte sequence, etc.) for each of the number of embedded record elements 844.

[0072] In another embodiment, and as discussed with reference to terminal data record 760 (FIG. 7), terminal data record 860, for example, may include a 12-bit length 861, a 4-bit status, and a variable-length string 862 (e.g., an IP address). Alternatively, variable length string 862 may be a byte sequence. Terminal data record 860 may include padding (not shown for clarity) to align each terminal data record to an 8-byte boundary in memory 104. Alternatively, terminal data record 860 may include padding (not shown for clarity) to a 4-byte boundary, or terminal data record 860 may not include any padding. Memory management algorithms may determine, generally, whether terminal data records 860 are padded to 8-byte, 4-byte, or 0-byte boundaries. Similarly, terminal data record 870 may include a 12-bit length 871, a 4-bit status, and a variable-length string 872 (e.g., an IP address).

[0073] Generally, both search indices, such as OAK trees, and data records may be structured so that 8-byte pointers are located on 8-byte boundaries in memory. For example, vertical node 810 may contain an 8-byte (or less) pointer to vertical node 820, and may be stored at a memory address divisible by eight (i.e., an 8-byte boundary, or 8N). Similarly, both search indices, such as OAK trees, and data records may be structured so that 4-byte pointers are located on 4-byte boundaries in memory. Consequently, modifications to database 800 may conclude by updating a pointer to an aligned address in memory using a single uninterruptible operation, including, for example, writing a new pointer to the search index, such as an OAK tree node, or writing a new pointer to a data record.

[0074] The various embodiments discussed above with reference to FIG. 8 present many advantages. For example, an OAK tree data structure is extremely space efficient and 8-bit clean. Regular expression searches may be used to search vertical nodes containing multi-character string fragments, since the 8-bit first character (e.g., first character 814), the 8-bit second character (e.g., second character 815) and any additional 8-bit characters (e.g., additional characters 816-1 . . . 816-N) may be contiguously located within the vertical node (e.g., vertical node 810). Search misses may be discovered quickly, and no more than N nodes may need to be traversed to search for an N-character length search string.

[0075] FIG. 9 is a top level flow diagram that illustrates a method for searching and concurrently updating a database without the use of database locks or access controls, according to embodiments of the present invention.

[0076] An update thread and a plurality of search threads may be created (900). In an embodiment, system 100 may spawn a single update thread to incorporate updates to the local database received, for example, from OLTP server 140-1 over WAN 124. In other embodiments, system 100 may receive updates from OLTP servers 140-1 . . . 140-S over WAN 124, and from plurality of network computers 120-1 . . . 120-N over WAN 124 or LAN 122. System 100 may also spawn a search thread in response to each session request received from the plurality of network computers 120-1 . . . 120-N. For example, a manager thread may poll one or more control ports, associated with one or more network interfaces 114-1 . . . 114-O, for session requests transmitted from the plurality of network computers 120-1 . . . 120-N. Once a session request from a particular network computer 120-1 . . . 120-N is received, the manager thread may spawn a search thread and associate the search thread with that particular network computer (e.g., PE).

[0077] In an alternative embodiment, system 100 may spawn a number of search threads without polling for session requests from the plurality of network computers 120-1 . . . 120-N. In this embodiment, the search threads may not be associated with particular network computers and may be distributed evenly among the plurality of processors 102-1 . . . 102-P. Alternatively, the search threads may execute on a subset of the plurality of processors 102-1 . . . 102-P. The number of search threads may not necessarily match the number of network computers (e.g., N).

[0078] A plurality of search queries may be received (910) over the network. In an embodiment, plurality of network computers 120-1 . . . 120-N may send the plurality of search queries to system 100 over LAN 122, or, alternatively, WAN 124. The plurality of search queries may contain, for example, a search term or key, as well as state information that may be associated with each query (e.g., query source address, protocol type, etc.). State information may be explicitly maintained by system 100, or, alternatively, a state information handle may be provided. In a preferred embodiment, each of the plurality of network computers 120-1 . . . 120-N may multiplex a predetermined number of search queries into a single network packet for transmission to system 100 (e.g., a Request SuperPacket 220 as depicted in FIG. 2).

[0079] Each search query may be assigned (920) to one of the search threads for processing. In an embodiment, each search thread may be associated with one of the plurality of network computers 120-1 . . . 120-N and all of the search queries received from that particular network computer may be assigned (920) to the search thread. In other words, one search thread may process all of the search queries arriving from a single network computer (e.g., a single PE). In an embodiment, each search thread may extract individual search queries from a single, multiplexed network packet (e.g., Request SuperPacket 220 as depicted in FIG. 2), or, alternatively, the extraction may be performed by a different process or thread.

[0080] In another embodiment, the search queries received from each of the plurality of network computers 120-1 . . . 120-N may be assigned (920) to different search threads. In this embodiment, the multi-thread assignment may be based on an optimal distribution function which may incorporate various system parameters including, for example, processor loading. Of course, the assignment of search queries to search threads may change over time, based upon various system parameters, including processor availability, system component performance, etc. Various mechanisms may be used to convey search queries to assigned search threads within system 100, such as, for example, shared memory, inter-process messages, tokens, semaphores, etc.
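
The two assignment policies described above (one search thread per network computer, or distribution according to load) might be sketched as below. The `Dispatcher` class, its method names, and the use of queue length as the load measure are hypothetical choices for the example; the disclosure does not specify the distribution function.

```python
import queue


class Dispatcher:
    """Conveys queries to per-thread work queues (illustrative only)."""

    def __init__(self, num_threads):
        self.queues = [queue.Queue() for _ in range(num_threads)]
        self.by_source = {}  # network computer id -> search thread index

    def assign_by_source(self, source, query):
        # All queries from one network computer go to the same search thread.
        idx = self.by_source.setdefault(
            source, len(self.by_source) % len(self.queues))
        self.queues[idx].put(query)
        return idx

    def assign_least_loaded(self, query):
        # Assumed load measure: current queue length. qsize() is only
        # approximate while other threads are adding or removing items.
        idx = min(range(len(self.queues)),
                  key=lambda i: self.queues[i].qsize())
        self.queues[idx].put(query)
        return idx
```

Each search thread would then block on its own queue and process queries as they arrive.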

[0081] Each search thread may search (930) the database based on the assigned search queries. Searching the database may depend upon the underlying structure of the database.

[0082] Referring to the database embodiment illustrated in FIG. 4, database 400 may be searched (930) for the search key. The data record (e.g., database record 420) corresponding to the search key may then be determined. Referring to the database embodiment illustrated in FIG. 5, look-aside file 520 may first be searched (930) for the search key, and, if a match is not determined, then snapshot file 510 may be searched (930). The data record corresponding to the search key may then be determined.
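
The two-stage search of FIG. 5 can be illustrated with a minimal sketch in which both files are modeled as in-memory dictionaries. The `DELETED` marker, the function name `search`, and the sample addresses in the usage below are assumptions for the example only.

```python
DELETED = object()  # hypothetical marker for a record removed since the snapshot


def search(key, look_aside, snapshot):
    """Search the look-aside data first, then fall back to the snapshot."""
    hit = look_aside.get(key)
    if hit is DELETED:
        return None            # the look-aside entry masks the snapshot record
    if hit is not None:
        return hit             # a newer record takes precedence
    return snapshot.get(key)   # no match in look-aside: consult the snapshot
```

For example, with `snapshot = {"law.com": "180.1.1.1"}` and an empty look-aside dictionary, `search("law.com", {}, snapshot)` returns the snapshot record.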

[0083] Referring to the database embodiment illustrated in FIG. 6, domain name data 610 may first be searched (930) for the search key, and the resource data within name server data 630 corresponding to the search key may then be determined. For example, for the “la.com” search key, a match may be determined with domain name record 620 in domain name data 610. The appropriate information may be extracted, including, for example, name server pointer 626. Then, the appropriate name server record 640 may be indexed using name server pointer 626, and name server network address 647 may be extracted.
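
The two-step lookup of FIG. 6 (domain name record first, then the referenced name server record) might look like the following sketch. A Python dictionary stands in for hash table 612 and a list index stands in for name server pointer 626; the class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NameServerRecord:
    name: str
    address: str            # plays the role of a name server network address


@dataclass
class DomainNameRecord:
    name: str
    name_server_index: int  # stands in for the name server pointer


def resolve(key, domain_table, name_servers):
    """Find the domain name record, then follow its name server reference."""
    record = domain_table.get(key)
    if record is None:
        return None
    return name_servers[record.name_server_index].address
```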

[0084] Referring to the database embodiment illustrated in FIG. 7, the TST may be searched (930) for the search key, from which the resource data may be determined. For example, for the “law.com” search key, search nodes 701 may be searched (930), and a match determined with node 730. Key pointer 736 may be extracted, from which the key data record 750 may be determined. The number of terminal data pointers 752 may then be identified and each terminal data pointer may be extracted. For example, terminal data pointer 757 may reference terminal data record 760 and terminal data pointer 758 may reference terminal data record 770. The variable-length resource data, e.g., name server network address 762 and name server network address 772, may then be extracted from each terminal data record using lengths 761 and 771, respectively.
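
A ternary search tree lookup of the kind described above can be sketched as follows. This is a textbook TST with one character per node, not the exact node layout of FIG. 7; the `Node` class, the `addresses` field (standing in for the key data record and its terminal data records), and the build-time helper `tst_insert` are assumptions for the example.

```python
class Node:
    def __init__(self, char):
        self.char = char        # comparison character
        self.lo = None          # "less than" pointer
        self.eq = None          # "equal to" pointer (next character position)
        self.hi = None          # "greater than" pointer
        self.addresses = None   # resource data if a key ends at this node


def tst_insert(node, key, addresses, i=0):
    """Build-time helper; this is not the lock-free update path."""
    if node is None:
        node = Node(key[i])
    if key[i] < node.char:
        node.lo = tst_insert(node.lo, key, addresses, i)
    elif key[i] > node.char:
        node.hi = tst_insert(node.hi, key, addresses, i)
    elif i < len(key) - 1:
        node.eq = tst_insert(node.eq, key, addresses, i + 1)
    else:
        node.addresses = list(addresses)
    return node


def tst_search(root, key):
    """Follow lo/eq/hi pointers one character at a time."""
    if not key:
        return None
    node, i = root, 0
    while node is not None:
        if key[i] < node.char:
            node = node.lo
        elif key[i] > node.char:
            node = node.hi
        elif i < len(key) - 1:
            node = node.eq
            i += 1
        else:
            return node.addresses   # None if no key ends here
    return None
```

A search that runs off a null pointer, or that ends on a node with no resource data, reports no match.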

[0085] Referring to the database embodiment illustrated in FIG. 8, the OAK tree may be searched (930) for the search key, from which the resource data may be determined. For example, for the “law.com” search key, search nodes 801 may be searched (930), and a match determined with node 830. Second address 837 may be extracted, from which the key data record 850 may be determined. The number of terminal data pointers 852 may then be identified and each terminal data pointer may be extracted. For example, terminal data pointer 857 may reference terminal data record 860 and terminal data pointer 858 may reference terminal data record 870. The variable-length resource data, e.g., name server network address 862 and name server network address 872, may then be extracted from each terminal data record using lengths 861 and 871, respectively.

[0086] Each search thread may create (940) a plurality of search replies corresponding to the assigned search queries. If a match is not found for a particular search key, the reply may include an appropriate indication, such as, for example, the null character. For domain name resolution, for example, a search key might be “law.com” and the corresponding resource data might be “180.1.1.1”. More than one name server network address may be associated with a search key, in which case, more than one name server network address may be determined.

[0087] The replies may be sent (950) over the network. In an embodiment, each search thread may multiplex the appropriate replies into a single network packet (e.g., Response SuperPacket 240) corresponding to the single network packet containing the original queries (e.g., Request SuperPacket 220). Alternatively, a different process or thread may multiplex the appropriate replies into the single network packet. The response network packet may then be sent (950) to the appropriate network computer within the plurality of network computers 120-1 . . . 120-N via LAN 122, or alternatively, WAN 124. In one embodiment, the response packets may be sent to the same network computer from which the request packets originated, while in another embodiment, the response packets may be sent to a different network computer.

[0088] The update thread may receive (960) new information over the network. In an embodiment, new information may be sent, for example, from OLTP server 140-1 to system 100 over WAN 124. In other embodiments, system 100 may receive updates from OLTP servers 140-1 . . . 140-S over WAN 124, and from plurality of network computers 120-1 . . . 120-N over WAN 124 or LAN 122. In the DNS resolution embodiment, for example, the new information may include new domain name data, new name server data, a new name server for an existing domain name, etc. Alternatively, the new information may indicate that a domain name, name server, name server network address, etc., may be deleted from the database. Generally, any information contained within the database may be added, modified or deleted, as appropriate.

[0089] The update thread may create (970) a new element in the database containing the new information. Generally, modifications to information contained within an existing element of the database are incorporated by creating a new element based on the existing element and then modifying the new element to include the new information. During this process, the new element may not be visible to the search threads or processes currently executing on system 100 until the new element has been committed to the database. Generally, additions to the database may be accomplished in a similar fashion, without necessarily using information contained within an existing element. In one embodiment, the deletion of an existing element from the database may be accomplished by adding a new, explicit “delete” element to the database. In another embodiment, the deletion of an existing element from the database may be accomplished by overwriting a pointer to the existing element with an appropriate indicator (e.g., a null pointer, etc.). In this embodiment, the update thread does not create a new element in the database containing new information.
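
The copy-then-modify approach, together with the two deletion variants, might be sketched as below. The sketch is an analogy under stated assumptions: a Python reference assignment stands in for the pointer write, the `Element` and `Database` classes and the `deleted` field are hypothetical, and the second address used in the usage is invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Element:
    key: str
    addresses: tuple
    deleted: bool = False   # explicit "delete" marker (hypothetical field)


class Database:
    def __init__(self):
        self.slots = {}     # key -> reference to an Element, or None

    def lookup(self, key):
        element = self.slots.get(key)
        if element is None or element.deleted:
            return None
        return element

    def modify(self, key, addresses):
        existing = self.slots[key]
        # Build the new element privately; the existing one is never altered.
        new_element = replace(existing, addresses=tuple(addresses))
        self.slots[key] = new_element   # commit with one reference write
        return new_element

    def delete_with_marker(self, key):
        # Variant 1: add an explicit "delete" element.
        self.slots[key] = Element(key, (), deleted=True)

    def delete_with_null(self, key):
        # Variant 2: overwrite the reference with a null indicator.
        self.slots[key] = None
```

A search that obtained a reference before `modify` ran keeps seeing the previous, unmodified element, which is the property the later commit and delayed-deletion steps rely on.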

[0090] In the DNS resolution embodiment, for example, the new information may include a new domain name to be added to the database. In this example, for simplicity, the new domain name may reference an existing name server. Referring to FIG. 6, memory space for a new domain name record 615 may be allocated from a memory pool associated with the domain name records 611, or, alternatively, from a general memory pool associated with domain name data 610. The new domain name may be normalized and copied to the new domain name record 615, and a pointer to an existing name server (e.g., name server record 655) may be determined and copied to the new domain name record 615. Other information may be calculated and added to new domain name record 615, such as, for example, a number of name servers, a chain pointer, etc. In more complicated examples, the new information may include a new search key with corresponding resource data.

[0091] Referring to FIG. 7, a new search node 705, as well as a new key data record 780, may first be created. In this example, the new search node 705 may include a comparison character (“m”), in the first position, that is greater than the comparison character (“l”), in the first position, of existing search node 710. Consequently, search node 705 may be inserted in the TST at the same “level” (i.e., 1st character position) as search node 710. Before search node 705 is committed to the database, the 4-byte “greater than” pointer 715 of search node 710 may contain a “null” pointer. Search node 705 may also include a 4-byte key pointer 706 which may contain a 40-bit pointer to the new key data record 780. Key data record 780 may include a key length 781 (e.g., “5”) and type 782 (e.g., indicating embedded resource data), a variable length key 783 (e.g., “m.com”), a number of embedded resources 784 (e.g., “1”), a resource length 785 (e.g., “9”), and a variable-length resource string 786 or byte sequence (e.g., “180.1.1.1”). In an embodiment, memory space may be allocated for search node 705 from a memory pool associated with TST nodes 701, while memory space may be allocated for the key data record 780 from a memory pool associated with plurality of key data records 702.

[0092] Referring to FIG. 8, a new search node 890, as well as a new key data record 880, may first be created. In this example, the new search node 890 may be a horizontal node including, for example, a 2-bit node type 891 (e.g., “01”), a 38-bit first address 892, an 8-bit address count 893 (e.g., 2), an 8-bit first character 894 (e.g., “l”), an 8-bit last character 895 (e.g., “m”), a variable-length bitmap 896 and a 38-bit second address 897. First address 892 may point to vertical node 820, the next vertical node in the “l . . . ” search string path, while second address 897 may point to key data record 880 associated with the search key fragment “m.” Key data record 880 may include a key length 881 (e.g., “5”) and type 882 (e.g., indicating embedded resource data), a variable length key 883 (e.g., “m.com”), a number of embedded resources 884 (e.g., “1”), a resource length 885 (e.g., “9”), and a variable-length resource string 886 or byte sequence (e.g., “180.1.1.1”). In an embodiment, memory space may be allocated for search node 890 from a memory pool associated with plurality of search nodes 801, while memory space may be allocated for key data record 880 from a memory pool associated with plurality of key data records 802.

[0093] The update thread may write (980) a pointer to the database using a single uninterruptible operation. Generally, a new element may be committed to the database (i.e., become visible to the search threads, or processes) instantaneously by writing a pointer to the new element to the appropriate location within the database. As discussed above, the appropriate location may be aligned in memory, so that the single operation includes a single store instruction of an appropriate length. In an embodiment, an existing element may be deleted from the database (i.e., become invisible to the search threads, or processes) by instantaneously overwriting a pointer to the existing element with an appropriate indicator (e.g., a “null” pointer, etc.). Again, the appropriate location may be aligned in memory, so that the single operation includes a single store instruction of an appropriate length.

[0094] Referring to FIG. 6, an 8-byte pointer corresponding to domain name record 620 may be written to hash table 612 (e.g., element 613). Importantly, the hash table entries are aligned on 8-byte boundaries in memory 104 to ensure that a single, 8-byte store instruction is used to update this value. Referring to FIG. 7, a 4-byte pointer corresponding to the new search node 705 may be written to the 4-byte “greater-than” node pointer 715 within search node 710. Importantly, the node pointer 715 is aligned on a 4-byte boundary in memory 104 to ensure that a single, 4-byte store instruction may be used to update this value. Referring to FIG. 8, plurality of search nodes 801 may also include a top-of-tree address 899, which may be aligned on an 8-byte word boundary in memory 104 and reference the first node within plurality of search nodes 801 (e.g., vertical node 810). An 8-byte pointer corresponding to the new search node 890 may be written to the top-of-tree address 899 using a single store instruction. In each of these embodiments, just prior to the store instruction, the new data are not visible to the search threads, while just after the store instruction, the new data are visible to the search threads. Thus, with a single, uninterruptible operation, the new data may be committed to the database without the use of database locks or access controls.
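
The FIG. 7 commit can be illustrated with a deliberately simplified sketch: the new search node and its key data record are built completely while unreachable, and a single reference assignment then makes them visible. In this analogy the Python assignment stands in for the aligned 4-byte store instruction; it does not model alignment or the hardware itself. The classes, the `search_level` helper, and the reduction of the TST to one level of “greater than” links are assumptions for the example.

```python
class KeyDataRecord:
    def __init__(self, key, resource):
        self.key = key
        self.resource = resource


class SearchNode:
    def __init__(self, char, key_record=None):
        self.char = char              # comparison character
        self.greater = None           # "greater than" pointer, initially null
        self.key_record = key_record  # stands in for the key pointer


def search_level(first, key):
    """Walk the first-level nodes through their "greater than" pointers."""
    if not key:
        return None
    node = first
    while node is not None:
        record = node.key_record
        if node.char == key[0] and record is not None and record.key == key:
            return record.resource
        node = node.greater
    return None


# Existing first-level node for "l" (plays the role of search node 710).
node_710 = SearchNode("l")

# Step 970: build the new element while no searcher can reach it.
record_780 = KeyDataRecord("m.com", "180.1.1.1")
node_705 = SearchNode("m", record_780)
visible_before = search_level(node_710, "m.com")

# Step 980: commit with one reference write.
node_710.greater = node_705
visible_after = search_level(node_710, "m.com")
```

A searcher running at any moment sees either the old null pointer or the fully built node, never a partially initialized one, because nothing in the new node is changed after the link is written.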

[0095] In an embodiment, the update thread may physically delete (990) an existing element after the pointer is written (980) to the database. Advantageously, for existing elements of the database that are modified or deleted, the physical deletion of these elements from memory 104 may be delayed to preserve consistency of in-progress searches. For example, after an existing element has been modified and the corresponding new element committed to the database, the physical deletion of the existing element from memory 104 may be delayed so that existing search threads that have a result, acquired just before the new element was committed to the database, may continue to use the previous state of the data. Similarly, after an existing element has been deleted from the database, the physical deletion of the existing element from memory 104 may be delayed so that existing search threads that have a result, acquired just before the existing element was deleted from the database, may continue to use the previous state of the data. The update thread may physically delete (990) the existing element after all the search threads that began before the existing element was modified, or deleted, have finished.
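
One way to decide when a retired element may be physically deleted is to record a logical tick when each search begins and when each element is retired. The disclosure does not say how finished searches are detected, so the `Reclaimer` class below, its tick counter, and its method names are all assumptions; the bookkeeping is shown single-threaded for clarity and would need its own synchronization or per-thread slots in practice.

```python
import itertools


class Reclaimer:
    """Defers physical deletion until earlier searches have finished."""

    def __init__(self):
        self.clock = 0        # logical time, advanced on each retirement
        self.active = {}      # search id -> tick at which the search began
        self.retired = []     # (retirement tick, element) awaiting deletion
        self.freed = []       # elements that have been physically deleted
        self._ids = itertools.count()

    def search_begin(self):
        sid = next(self._ids)
        self.active[sid] = self.clock
        return sid

    def search_end(self, sid):
        del self.active[sid]

    def retire(self, element):
        # Called after the pointer write has made the element unreachable.
        self.clock += 1
        self.retired.append((self.clock, element))

    def collect(self):
        # An element is safe once no active search began before its retirement.
        oldest = min(self.active.values(), default=self.clock)
        still_waiting = []
        for tick, element in self.retired:
            if tick <= oldest:
                self.freed.append(element)
            else:
                still_waiting.append((tick, element))
        self.retired = still_waiting
```

Searches that begin after the retirement cannot reach the old element, so they do not delay its deletion.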

[0096] Potential complications may arise from the interaction of methods associated with embodiments of the present invention and various architectural characteristics of system 100. For example, the processor on which the update thread is executing (e.g., processor 102-1, 102-2, etc.) may include hardware to support out-of-order instruction execution. In another example, system 100 may include an optimizing compiler which may produce a sequence of instructions, associated with embodiments of the present invention, that have been optimally rearranged to exploit the parallelism of the processor's internal architecture (e.g., processor 102-1, 102-2, etc.). Many other complications may readily be admitted by one skilled in the art. Data hazards arising from out-of-order instruction execution may be eliminated, for example, by creating dependencies between the creation (970) of the new element and the pointer write (980) to the database.

[0097] In one embodiment, these dependencies may be established by inserting additional arithmetic operations, such as, for example, an exclusive OR (XOR) instruction, into the sequence of instructions executed by processor 102-1 to force the instructions associated with the creation (970) of the new element to issue, or complete, before the execution of the pointer write (980) to the database. For example, the contents of the location in memory 104 corresponding to the new element may be XOR'ed with the contents of the location in memory 104 corresponding to the pointer to the new element. Subsequently, the address of the new element may be written (980) to memory 104 to commit the new element to the database. Numerous methods to overcome these complications may be readily discernable to one skilled in the art.

[0098] Several embodiments of the present invention are specifically illustrated and described herein. However, it will be appreciated that modifications and variations of the present invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention.

What is claimed is:
 1. A multi-threaded network database system, comprising: at least one processor coupled to a network; and a memory coupled to the processor, the memory including a database and instructions adapted to be executed by the processor to: create an update thread and a plurality of search threads; assign each of a plurality of search queries, received over the network, to one of the plurality of search threads; for each search thread: search the database according to the assigned search queries, create a plurality of search replies corresponding to the assigned search queries, and send the plurality of search replies over the network; and for the update thread: create a new element according to new information received over the network, and without restricting access to the database for the plurality of search threads, write a pointer to the new element to the database using a single uninterruptible operation.
 2. The system of claim 1, wherein the single uninterruptible operation is a store instruction.
 3. The system of claim 1, further comprising: for the update thread: physically delete an existing element from the memory after the pointer is written to the database.
 4. The system of claim 2, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 5. The system of claim 2, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 6. The system of claim 2, wherein the processor has a word size of at least n-bytes, the memory has a width of at least n-bytes and the store instruction writes n-bytes to a memory address located on an n-byte boundary.
 7. The system of claim 1, wherein the plurality of search queries are received within a single network packet.
 8. The system of claim 1, wherein the plurality of search replies are sent within a single network packet.
 9. The system of claim 1, wherein said restricting access includes database locking.
 10. The system of claim 1, wherein said restricting access includes spin locking.
 11. The system of claim 10, wherein said spin locking includes the use of at least one semaphore.
 12. The system of claim 11, wherein said semaphore is a mutex semaphore.
 13. The system of claim 1, further comprising a plurality of processors and a symmetric multi-processing operating system.
 14. The system of claim 13, wherein the plurality of search threads perform at least 100,000 searches per second.
 15. The system of claim 14, wherein the update thread performs at least 10,000 updates per second.
 16. The system of claim 15, wherein the update thread performs between 50,000 and 130,000 updates per second.
 17. The system of claim 1, wherein the pointer to the new element is written to a search index.
 18. The system of claim 17, wherein the search index is a TST.
 19. The system of claim 17, wherein the search index is a hash table.
 20. The system of claim 1, wherein the pointer to the new element is written to a data record within the database.
 21. A method for searching and concurrently updating a database, comprising: creating an update thread and a plurality of search threads; assigning each of a plurality of search queries, received over a network, to one of the plurality of search threads; for each search thread: searching the database according to the assigned search queries, creating a plurality of search replies corresponding to the assigned search queries, and sending the plurality of search replies over the network; and for the update thread: creating a new element according to new information received over the network, and without restricting access to the database for the plurality of search threads, writing a pointer to the new element to the database using a single uninterruptible operation.
 22. The method of claim 21, wherein the single uninterruptible operation is a store instruction.
 23. The method of claim 21, further comprising: for the update thread: physically deleting an existing element after the pointer is written to the database.
 24. The method of claim 22, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 25. The method of claim 22, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 26. The method of claim 21, wherein the plurality of search queries are received within a single network packet.
 27. The method of claim 21, wherein the plurality of search replies are sent within a single network packet.
 28. The method of claim 21, wherein said restricting access includes database locking.
 29. The method of claim 21, wherein said restricting access includes spin locking.
 30. The method of claim 29, wherein said spin locking includes the use of at least one semaphore.
 31. The method of claim 30, wherein said semaphore is a mutex semaphore.
 32. The method of claim 21, wherein the plurality of search threads perform at least 100,000 searches per second.
 33. The method of claim 32, wherein the update thread performs at least 10,000 updates per second.
 34. The method of claim 33, wherein the update thread performs between 50,000 and 130,000 updates per second.
 35. The method of claim 21, wherein the pointer to the new element is written to a search index.
 36. The method of claim 35, wherein the search index is a TST.
 37. The method of claim 21, wherein the pointer to the new element is written to a data record within the database.
 38. A computer readable medium including instructions adapted to be executed by at least one processor to implement a method for searching and concurrently updating a database, the method comprising: creating an update thread and a plurality of search threads; assigning each of a plurality of search queries, received over a network, to one of the plurality of search threads; for each search thread: searching a database according to the assigned search queries, creating a plurality of search replies corresponding to the assigned search queries, and sending the plurality of search replies over the network; and for the update thread: creating a new element according to new information received over the network, and without restricting access to the database for the plurality of search threads, writing a pointer to the new element to the database using a single uninterruptible operation.
 39. The computer readable medium of claim 38, wherein the single uninterruptible operation is a store instruction.
 40. The computer readable medium of claim 38, wherein said method further comprising: for the update thread: physically deleting an existing element after the pointer is written to the database.
 41. The computer readable medium of claim 39, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 42. The computer readable medium of claim 39, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 43. The computer readable medium of claim 38, wherein the plurality of search queries are received within a single network packet.
 44. The computer readable medium of claim 38, wherein the plurality of search replies are sent within a single network packet.
 45. The computer readable medium of claim 38, wherein said restricting access includes database locking.
 46. The computer readable medium of claim 38, wherein said restricting access includes spin locking.
 47. The computer readable medium of claim 46, wherein said spin locking includes the use of at least one semaphore.
 48. The computer readable medium of claim 47, wherein said semaphore is a mutex semaphore.
 49. The computer readable medium of claim 38, wherein the pointer to the new element is written to a search index.
 50. The computer readable medium of claim 49, wherein the search index is a TST.
 51. The computer readable medium of claim 38, wherein the pointer to the new element is written to a data record within the database.
 52. A method for searching and updating a database, comprising: receiving a plurality of search queries over a network; searching the database; sending a plurality of search replies over the network; and while searching the database, modifying the database to incorporate new information received over the network, including: creating a new element based on the new information, and without locking the database, writing a pointer to the new element to the database using a single uninterruptible operation.
 53. The method of claim 52, wherein the single uninterruptible operation is a store instruction.
 54. The method of claim 52, further comprising: physically deleting an existing element after the pointer is written to the database.
 55. The method of claim 53, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 56. The method of claim 53, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 57. The method of claim 53, wherein the store instruction writes n-bytes to a memory address located on an n-byte boundary.
 58. The method of claim 52, wherein the plurality of search queries are received within a first network packet and the plurality of search replies are sent within a second network packet.
 59. The method of claim 52, wherein said searching the database includes at least 100,000 searches per second.
 60. The method of claim 52, wherein said modifying the database includes at least 10,000 modifications per second.
 61. The method of claim 60, wherein the update thread performs between 50,000 and 130,000 updates per second.
 62. The method of claim 52, wherein the pointer to the new element is written to a search index.
 63. The method of claim 52, wherein the pointer to the new element is written to a data record within the database.
 64. A system for searching and updating a database, comprising: at least one processor coupled to a network; and a memory coupled to the processor, the memory storing a database and instructions adapted to be executed by the processor to: create a new element based on new information received over the network, and without restricting search access to the database, write a pointer to the new element to the database using a single uninterruptible operation.
 65. The system of claim 64, wherein the single uninterruptible operation is a store instruction.
 66. The system of claim 64, wherein the instructions are further adapted to: physically delete an existing element after the pointer is written to the database.
 67. The system of claim 65, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 68. The system of claim 65, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 69. The system of claim 65, wherein the processor has a word size of at least n-bytes, the memory has a width of at least n-bytes and the store instruction writes n-bytes to a memory address located on an n-byte boundary.
 70. The system of claim 64, wherein said restricting search access includes database locking.
 71. The system of claim 64, wherein said restricting search access includes spin locking.
 72. The system of claim 71, wherein said spin locking includes the use of at least one semaphore.
 73. The system of claim 72, wherein said semaphore is a mutex semaphore.
 74. The system of claim 64, wherein the pointer to the new element is written to a search index.
 75. The system of claim 64, wherein the search index is a TST.
 76. The system of claim 64, wherein the pointer to the new element is written to a data record within the database.
 77. A method for updating a non-concurrency controlled ternary search tree having a plurality of nodes, comprising: creating a new node; and updating a pointer within one of the plurality of nodes to reference the new node using a single store instruction.
 78. The method of claim 77, wherein the single uninterruptible operation is a store instruction.
 79. The method of claim 78, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 80. The method of claim 78, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 81. The method of claim 78, wherein the store instruction writes n-bytes to a memory address located on an n-byte boundary.
 82. The method of claim 77, wherein said updating includes at least 10,000 updates per second.
 83. The method of claim 82, wherein said updating includes between 50,000 and 130,000 updates per second.
 84. A method for updating a non-concurrency controlled ordered access key search tree having a plurality of vertical and horizontal nodes, comprising: creating a new node; and updating a pointer within one of the plurality of nodes to reference the new node using a single store instruction.
 85. The method of claim 84, wherein the single uninterruptible operation is a store instruction.
 86. The method of claim 85, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 87. The method of claim 85, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 88. The method of claim 85, wherein the store instruction writes n-bytes to a memory address located on an n-byte boundary.
 89. The method of claim 84, wherein said updating includes at least 10,000 updates per second.
 90. The system of claim 89, wherein said updating includes between 50,000 and 130,000 updates per second.
 91. A method for updating a non-locking linked list having a first element, a second element and a third element, the first element having a first pointer to the second element, and the second element having a second pointer to the third element, comprising: creating a new element having a pointer to the third element; and updating the first pointer to reference the new element using a single store instruction.
 92. The method of claim 91, wherein the single uninterruptible operation is a store instruction.
 93. The method of claim 92, wherein the store instruction writes four bytes to a memory address located on a four byte boundary.
 94. The method of claim 92, wherein the store instruction writes eight bytes to a memory address located on an eight byte boundary.
 95. The method of claim 92, wherein the store instruction writes n-bytes to a memory address located on an n-byte boundary.
 96. The method of claim 91, wherein said updating includes at least 10,000 updates per second.
 97. The method of claim 97, wherein said updating includes between 50,000 and 130,000 updates per second.
 98. A method for searching and concurrently updating a database, comprising: creating an update thread and a plurality of search threads; assigning each of a plurality of search queries, received over the network, to one of the plurality of search threads; for each search thread: searching the database according to the assigned search queries, creating a plurality of search replies corresponding to the assigned search queries, and sending the plurality of search replies over the network; and for the update thread: without restricting access to the database for the plurality of search threads, writing a pointer to an existing element to the database using a single uninterruptible operation.
 99. The method of claim 98, wherein the pointer comprises a null pointer.
 100. The method of claim 98, further comprising: for the update thread: physically deleting the existing element after the pointer is written to the database.